-
Notifications
You must be signed in to change notification settings - Fork 41
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Elasticsearch stop/start/restart #1990
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this has the major components in place and needs some testing and tweaking. Other notes
.travis/secrets.tar.enc
change shouldn't be in this PR- We want to get rid of
restart-elasticsearch
, and I think there's no reason not to do it in the same PR and just make sure it's well tested
@@ -6,26 +6,46 @@ | |||
retries: 20 | |||
delay: 3 | |||
changed_when: result.stdout.find('"acknowledged":true') != -1 | |||
tags: action_stop | |||
|
|||
- debug: msg="{{ result }}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you want to keep this debug output?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
||
- name: get es instances to kill | ||
shell: "ps aux | pgrep 'elasticsearc[h]' | awk '{print $2}'" | ||
register: es_instances |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might call this es_pids
rather than es_instances
since that's more specifically what they are.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
||
- name: kill elasticsearch instances | ||
shell: "pkill {{item}}" | ||
with_items: es_instances |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this work? I would have though that you would have to use es_instances.stdout.splitlines()
or something
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if action == 'status': | ||
return ElasticsearchClassic(self.environment, self.ansible_context).execute_action(action, host_pattern, process_pattern) | ||
else: | ||
response = six.moves.input("This function does more than stop and start the elasticsearch service. For that, use elasticsearch-classic." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: these lines are long. I'd suggest just
response = six.moves.input(
"this style"
"\nof indent"
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"\nStop will: stop pillows, stop es, and kill -9 if any processes still exist after a period of time. " | ||
"\nStart will start pillows and start elasticsearch " | ||
"\nRestart is a stop followed by a start.\n Continue? [y/n]: ") | ||
if response.lower() == "n": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd use ask
for this the way other commands in commcare-cloud do for a consistent experience (and consistent code). E.g.
if not ask('Have you stopped all the elastic pillows?', strict=True, quiet=args.quiet): |
ask
is just a little utility function we wrote, not a common python thing.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to know
bed0323
if not exit_code == 0: | ||
print("ERROR while starting pillows. Exiting.") | ||
sys.exit(1) | ||
self._run_rolling_restart_yml(tags='action_start') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it looks like elasticsearch is being started after pillows are started if i'm reading this correctly. we probably want to start elasticsearch first, and then start pillows
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How come we want to do it in that order?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
most pillows populate data into elasticsearch, so for the time between the pillows starting and es starting, the pillows will just be creating errors. they will retry eventually but there's no point to starting a service if the service it depends on is down
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense, done here: 9a8c2ae
(also refactored the pillow start/stop code in there)
This is looking great! I think next step is to run on staging (if you haven't) in a few different ways (restart, stop, start, etc.) and tell us the outcome. Then I think it's good to merge! |
def _act_on_pillows(self, action): | ||
# Used to stop or start pillows | ||
ansible_context = AnsibleContext(None) | ||
service = SERVICES_BY_NAME['pillowtop'](self.environment, ansible_context) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason not to reference it directly rather than looking it up in the dict?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, ba1d401
Just tested it on staging and confirms that it works as expected. Will merge after the tests pass unless anyone has any objections! |
🎉 |
#1738